Kent Ridge Road, Singapore 119260 TRC5/07 ICRA: Effective Semantics for Ranked XML Keyword Search
Authors
Abstract
Keyword search is a user-friendly way to query XML databases. Most previous efforts in this area focus on keyword proximity search in XML based on either a tree data model or a graph (or digraph) data model. The tree data model for XML is generally simple and efficient for keyword proximity search. However, it cannot capture connections such as ID references in XML databases. In contrast, techniques based on the graph (or digraph) data model capture connections but are generally inefficient to compute. In this paper, we propose an interconnected object trees model for keyword search that achieves the efficiency of the tree model while capturing connections such as ID references in XML by fully exploiting the property and schema information of XML databases. In particular, we propose ICA (Interested Common Ancestor) semantics to find all predefined interested objects that contain all query keywords. We also introduce novel IRA (Interested Related Ancestors) semantics to capture the conceptual connections between interested objects and to include more objects that contain only some of the query keywords. We then study a novel ranking metric, RelevanceRank, that dynamically assigns higher ranks to objects that are more relevant to a given keyword query according to the conceptual connections in IRAs. We design and analyze efficient algorithms for keyword search based on our data model, and experimental results show that our approach is efficient and outperforms most existing systems in terms of result quality. A prototype of our ICRA system (ICRA = ICA + IRA) on the updated 321M DBLP data is available at http://xmldb.ddns.comp.nus.edu.sg/.
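The one-sentence description of ICA in the abstract (find predefined interested objects that contain all query keywords) can be illustrated with a minimal sketch. This is only a naive reading of that sentence, not the algorithm from the paper: the sample XML, the tag names, and the helper names `subtree_words` and `ica_objects` are all hypothetical, and IRA semantics, ID references, and RelevanceRank are not modelled.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; tag names and ids are assumptions, not real DBLP records.
SAMPLE = """
<dblp>
  <paper id="p1"><title>XML keyword search</title><author>Ling</author></paper>
  <paper id="p2"><title>Graph keyword search</title><author>Chen</author></paper>
  <paper id="p3"><title>XML query algebra</title><author>Ni</author></paper>
</dblp>
"""


def subtree_words(elem):
    """Collect the lowercase words of all text inside elem's subtree."""
    return set(" ".join(elem.itertext()).lower().split())


def ica_objects(root, interested_tags, keywords):
    """Return interested objects (by tag) whose subtree contains every keyword.

    Naive full scan in document order; an assumption-laden illustration only.
    """
    wanted = {k.lower() for k in keywords}
    return [
        e for e in root.iter()
        if e.tag in interested_tags and wanted <= subtree_words(e)
    ]


root = ET.fromstring(SAMPLE)
hits = [e.get("id") for e in ica_objects(root, {"paper"}, ["XML", "keyword"])]
print(hits)
```

Under this reading, only objects of a predefined class (here the assumed `paper` tag) are returned, so the document root is never reported even though it contains every keyword.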
Similar resources
Lower Kent Ridge Road, Singapore 119260 TRB9/05 Algebra and the Formal Semantics of GLASS Wei NI and Tok Wang LING September 2005
In the database world, it is common to translate a query language into an algebra in order to define the formal semantics of the query language precisely and to perform query optimization later. In this paper, we examine the scenario of graphical XML query languages, focus on their expressive power and present the underlying algebra of our graphical XML query language. Compared with various previ...
Demonstrating Effective Ranked XML Keyword Search with Meaningful Result Display
In this paper, we demonstrate an effective ranked XML keyword search with meaningful result display. Our system, named ICRA, recognizes a set of object classes in XML data for result display, defines the matching semantics that meet user’s search needs more precisely, captures the ID references in XML data to find more relevant results, and adopts novel ranking schemes. ICRA achieves both high ...
Ultra-low-k materials based on nanoporous fluorinated polyimide with well-defined pores via the RAFT-moderated graft polymerization process
Y. W. Chen, W. C. Wang, W. H. Yu, E. T. Kang,* K. G. Neoh, R. H. Vora, C. K. Ong and L. F. Chen Department of Chemical Engineering, National University of Singapore, 10 Kent Ridge Crescent, Singapore 119260. E-mail: [email protected]; Fax: (65) 67791936; Tel: (65) 68742189 Department of Materials Science, National University of Singapore, 10 Kent Ridge Crescent, Singapore 119260 Department of P...
The Needs and Benefits of Applying Textual Data Mining within the Product Development Process
Application The Needs and Benefits of Applying Textual Data Mining within the Product Development Process Rakesh Menon1,∗,†, Loh Han Tong1,2, S. Sathiyakeerthi2, Aarnout Brombacher1,3 and Christopher Leong4 1Design Technology Institute, 10 Kent Ridge Crescent, Singapore 119260 2Department of Mechanical Engineering, National University of Singapore, Singapore 119260 3Technische Universiteit Eind...
Journal title:
Volume, issue
Pages -
Publication date 2007